A new indicator, a real-valued s-index, is suggested to characterize the quality and impact of
scientific research output. It is expected to be at least as useful as the notorious h-index, while
avoiding some of its obvious drawbacks.
Sound, sound your trumpets and beat your drums! Here it is, an impossible thing performed: a single integer
number characterizes both the productivity and the quality of a scientific research output. Suggested by Jorge Hirsch [1], this
simple and intuitively appealing h-index has shaken academia like a storm, generating huge public interest and a
number of discussions and generalizations [2, 3, ...].

A Russian physicist with whom I was acquainted long ago used to say that academia is not a Christian
environment. It is a pagan one, with its hero-worship tradition. But hero-worship requires ranking. And a simple
indicator, simple enough to be understood even by dummies, is an ideal instrument for such a ranking.

The h-index is defined as the highest number h of papers that have each received h or more citations. Empirically,

$$h \approx \sqrt{\frac{C_{tot}}{a}}, \qquad (1)$$

with a ranging between three and five [1]. Here $C_{tot}$ stands for the total number of citations.
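The definition of the h-index translates directly into a few lines of code. The sketch below is only an illustration: the function name `h_index` and the toy citation record are assumptions made here, not data from the text. It also prints the value of a implied by the empirical relation (1) for that toy record.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank      # the top `rank` papers all have >= rank citations
        else:
            break         # counts only decrease from here on
    return h


# Hypothetical citation record, invented purely for illustration.
toy_record = [25, 8, 5, 3, 3, 1, 0]

h = h_index(toy_record)
c_tot = sum(toy_record)
a = c_tot / h**2          # the constant a in relation (1), C_tot = a * h^2
print(h, c_tot, a)
```

For this toy record the implied a sits at the upper end of the quoted range, which is all the relation (1) promises: a rough proportionality, not an identity.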
And now, with this simple and adorable instrument of ranking on the pedestal, I'm going into a risky business to
suggest an alternative to it. Am I reckless? Not quite. I know a magic word which should impress pagans with an
irresistible witchery.
Claude Shannon introduced the quantity

$$S = -\sum_{i=1}^{N} p_i \log p_i, \qquad (2)$$

which is a measure of information uncertainty and plays a central role in information theory [12]. On the advice of
John von Neumann, Shannon called it entropy. According to Feynman [13], von Neumann declared to Shannon that
this magic word would give him "a great edge in debates because nobody really knows what entropy is anyway".

Armed with this magic word, entropy, we have some chance to overthrow the present idol. So, let us try it! Citation
entropy is naturally defined by (2), with

$$p_i = \frac{C_i}{C_{tot}},$$

where $C_i$ is the number of citations on the i-th paper of the citation series. Now, in analogy with (1), we can define
the citation record strength index, or s-index, as follows:

$$s = \frac{1}{2}\sqrt{\frac{S\,C_{tot}}{S_0}}, \qquad (3)$$

where

$$S_0 = \log N$$

is the maximum possible entropy for a citation series with N papers in total, corresponding to the uniform citation
record with $p_i = 1/N$.
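A minimal sketch of the s-index computation follows. It reads the definition as s = (1/2) sqrt(S C_tot / S_0) with S_0 = log N, which is the reading that reproduces the s values quoted in Table II. The function names, the two toy records, and the choice to return 0 when N < 2 or C_tot = 0 are assumptions made here for illustration.

```python
import math


def citation_entropy(citations):
    """Shannon entropy with p_i = C_i / C_tot; uncited papers contribute 0."""
    c_tot = sum(citations)
    return -sum((c / c_tot) * math.log(c / c_tot) for c in citations if c > 0)


def s_index(citations):
    """s = (1/2) * sqrt(C_tot * S / S_0), with S_0 = log N."""
    n = len(citations)
    c_tot = sum(citations)
    if n < 2 or c_tot == 0:
        # Assumption: S_0 = log 1 = 0 leaves the ratio undefined, so return 0.
        return 0.0
    return 0.5 * math.sqrt(c_tot * citation_entropy(citations) / math.log(n))


# Two hypothetical records with the same N = 10 and C_tot = 100.
uniform = [10] * 10          # citations spread evenly, so S = S_0
skewed = [91] + [1] * 9      # one dominant paper, so S is well below S_0

print(s_index(uniform), s_index(skewed))
```

The uniform record attains the maximum s = sqrt(C_tot)/2 for its total, while the skewed record with the same total scores lower. This is the sense in which s weighs how citations are distributed and not only how many there are.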

That's all. Here it is, a new index s before you. The concept is clear and the definition simple. But can it compete
with the h-index, which has already gained impetus? I do not know. In fact, it does not matter much whether the
new index will be embraced with delight or will be coldly rejected with eyes wide shut. I sound my lonely trumpet
in the dark, trying to relax at the edge of the precipice which once again faces me. Nevertheless, I feel the s-index gives a
fairer ranking than the h-index, at least in the situation considered below.

Let us consider the following citation records from the SPIRES database (see Table I). What can we say about
these unpersonalized records, just looking at the numbers? I think citation record A is good, B is average, C is
quite remarkable, D is splendid, E is very good, and F is excellent.
[Table I body not recoverable: only isolated per-paper citation counts for records A to F survive.]

TABLE I. Some citation records from the SPIRES database.



Not surprising, because D is for Paul Dirac and F is for Richard Feynman (as represented in the SPIRES database).
You would be surprised to know who A is, but I postpone revealing his identity. I know physicist B very well and
he is, alas, not prominent (at least up to now). C is for Petr Hořava, who is well known for his articles with Edward
Witten in string theory. Finally, E is for William Unruh, a well-established researcher with many creative ideas.

How is all this captured by the h and s indexes? See Table II to be convinced that the s-index conveys our intuitive ranking
(emerging solely from the unpersonalized citation records, before revealing the identities of the physicists behind the
records) better than the h-index.



Citation record   C_tot   S/S_0   s-index   h-index
A                 2375    0.72    20.7      19
B                 1647    0.82    18.4      22
C                 5070    0.61    27.8      21
D                 7550    0.76    37.8      27
E                 4044    0.72    26.9      27
F                 9984    0.75    43.2      28

TABLE II. Total citation number, relative entropy, s and h indexes for the citation records from Table I.
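As a cross-check, the row for record C can be recomputed from the citation series that the text quotes when it introduces permutation entropy (the overlapping groups for record C determine the full series of 40 papers). The sketch below assumes that uncited papers count towards N and contribute nothing to the entropy sum, and that the s-index is (1/2) sqrt(C_tot S/S_0). Under these assumptions the computed values agree with Table II to the quoted precision.

```python
import math

# Citation record C, read off the overlapping groups quoted in the text.
record_c = [258, 202, 0, 0, 13, 58, 12, 18, 4, 21, 25, 26, 30, 1778, 1316,
            219, 65, 107, 17, 235, 41, 45, 97, 116, 108, 61, 21, 0, 25, 16,
            20, 7, 6, 6, 9, 10, 17, 21, 26, 14]

c_tot = sum(record_c)                 # total number of citations
n = len(record_c)                     # number of papers, uncited ones included

# Citation entropy with p_i = C_i / C_tot (terms with C_i = 0 vanish).
entropy = -sum((c / c_tot) * math.log(c / c_tot) for c in record_c if c > 0)
relative_entropy = entropy / math.log(n)          # S / S_0

s = 0.5 * math.sqrt(c_tot * relative_entropy)     # s-index

# h-index: number of ranks at which the ranked count is at least the rank.
ranked = sorted(record_c, reverse=True)
h = sum(1 for rank, c in enumerate(ranked, start=1) if c >= rank)

print(c_tot, round(relative_entropy, 2), round(s, 1), h)
```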



One thing which catches the eye is that citation series seem to be completely random. That is, even knowing all
citation numbers $C_i$ for i = 1, ..., n, we cannot predict the next citation number $C_{n+1}$. In fact, there exists an objective
criterion to decide whether a series is random or deterministic. This criterion is also related to the magic word,
entropy, and is called permutation entropy [14].

The permutation entropy is defined as follows. Let us split the citation series into overlapping groups of n elements
each. For example, for n = 3 and citation record C we have the following groups:

{258,202,0}, {202,0,0}, {0,0,13}, {0,13,58}, {13,58,12}, {58,12,18}, {12,18,4},

{18,4,21}, {4,21,25}, {21,25,26}, {25,26,30}, {26,30,1778}, {30,1778,1316}, {1778,1316,219},

{1316,219,65}, {219,65,107}, {65,107,17}, {107,17,235}, {17,235,41}, {235,41,45}, {41,45,97},

{45,97,116}, {97,116,108}, {116,108,61}, {108,61,21}, {61,21,0}, {21,0,25}, {0,25,16},

{25,16,20}, {16,20,7}, {20,7,6}, {7,6,6}, {6,6,9}, {6,9,10}, {9,10,17},

{10,17,21}, {17,21,26}, {21,26,14}.
For each group, there exists a unique permutation π which orders the elements of this group in an ascending manner.
For example, for the first group, {258, 202, 0}, the permutation which orders it is {3, 2, 1}, because 0 < 202 < 258. In
the second group, {202, 0, 0}, we have two equal elements. In such a case we assume that the first of them is in fact a
bit smaller: it has had a greater time span to collect citations and is therefore assumed to correspond to a lower accumulation rate.
Under this convention, the order permutation of the {202,0,0} group is {2,3,1}. Analogously, we can find the order
permutations for the other groups and get the following sequence:

$$\pi_6, \pi_3, \pi_1, \pi_1, \pi_2, \pi_3, \pi_2, \pi_5, \pi_1, \pi_1, \pi_1, \pi_1, \pi_4, \pi_6, \pi_6, \pi_3, \pi_2, \pi_5, \pi_4, \pi_3, \pi_1, \pi_1, \pi_4, \pi_6, \pi_6, \pi_6,$$
$$\pi_5, \pi_4, \pi_3, \pi_2, \pi_6, \pi_3, \pi_1, \pi_1, \pi_1, \pi_1, \pi_1, \pi_2,$$

where

$$\pi_1 = \{1,2,3\},\ \pi_2 = \{3,1,2\},\ \pi_3 = \{2,3,1\},\ \pi_4 = \{1,3,2\},\ \pi_5 = \{2,1,3\},\ \pi_6 = \{3,2,1\}.$$



Now we can calculate the relative frequencies with which particular order permutations occur: $p_1 = 13/38 \approx 0.342$,
$p_2 = 5/38 \approx 0.132$, $p_3 = 6/38 \approx 0.158$, $p_4 = 4/38 \approx 0.105$, $p_5 = 3/38 \approx 0.079$ and $p_6 = 7/38 \approx 0.184$. Then the
permutation entropy of order n is given by the Shannon formula

$$S_P^{(n)} = -\sum_{i=1}^{n!} p_i \log p_i. \qquad (4)$$

In our above example of the citation record C, we get $S_P^{(3)}/\log 6 \approx 0.93$, where log 6 is the maximum permutation
entropy for order n = 3, corresponding to a completely random series in which each order permutation $\pi_i$ appears
with equal frequency $p_i = 1/6$. For the other citation records, the normalized permutation entropies are even higher,
ranging from 0.96 to 0.99. Bearing in mind that the citation series considered are relatively short, N ~ 100, I don't think
that the deviations from unity are statistically significant.
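The procedure just described (overlapping windows, ordering permutation of each window, Shannon entropy of the permutation frequencies) can be sketched as follows. The function names are assumptions made here. The tie-breaking rule from the text, where the earlier of two equal elements counts as smaller, is obtained from the stability of Python's `sorted`. The series is record C as quoted in the groups above.

```python
import math
from collections import Counter


def ordinal_pattern(window):
    """1-based positions of the window's elements, listed in ascending order.

    sorted() is stable, so of two equal elements the earlier one comes first,
    which matches the tie-breaking convention used in the text.
    """
    order = sorted(range(len(window)), key=lambda i: window[i])
    return tuple(i + 1 for i in order)


def permutation_entropy(series, n):
    """Return (S_P^(n) / log(n!), counts of each order permutation)."""
    windows = [series[i:i + n] for i in range(len(series) - n + 1)]
    counts = Counter(ordinal_pattern(w) for w in windows)
    total = len(windows)
    entropy = -sum((k / total) * math.log(k / total) for k in counts.values())
    return entropy / math.log(math.factorial(n)), counts


record_c = [258, 202, 0, 0, 13, 58, 12, 18, 4, 21, 25, 26, 30, 1778, 1316,
            219, 65, 107, 17, 235, 41, 45, 97, 116, 108, 61, 21, 0, 25, 16,
            20, 7, 6, 6, 9, 10, 17, 21, 26, 14]

normalized, counts = permutation_entropy(record_c, 3)
print(round(normalized, 2))
for pattern, k in sorted(counts.items()):
    print(pattern, k)
```

Run on record C with n = 3, this yields 38 windows and the same six frequencies as quoted above, with a normalized entropy that rounds to 0.93.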

I even tried to overcome the principle of least effort [3], which leads to the fancy Zipf's law, and analyzed
the longer citation record of Edward Witten (N ~ 300). Having h = 141 and s = 136.7, this citation record has
permutation entropy $S_P^{(3)}/\log 6 > 0.99$. Even if we increase the order of the permutation entropy and take n = 4,
we get almost the same result for the normalized permutation entropy: $S_P^{(4)}/\log 24 = 0.99$. Therefore, we can safely
conclude that citation series are expected to be random. An interesting question is whether self-citations introduce
some deterministic element into the citation record and can therefore, to some extent, be detected by measuring the
permutation entropy.

It remains to surprise you by revealing that physicist A is in reality Albert Einstein! Do you need a more profound
caveat against taking all these indexes and the corresponding ranking of scientists too seriously? "Not everything
that can be counted counts, and not everything that counts can be counted" [17]. Sound your trumpets